Decoding Optimization for Chinese-English Machine Translation via a Dependent Syntax Language Model

Authors

Ying Liu

Zhengtao Yu

Tao Zhang

Xing Zhao

Abstract

Decoding is a core process in statistical machine translation (SMT) and determines its final output. This paper proposes a decoding optimization for Chinese-English SMT based on a dependent syntax language model, with the aim of improving decoder performance. A dependent language model is first trained on the data set and then used to score the N-best list produced by the decoder. These scores are added to the original decoder scores of the N-best list, and the list is reordered accordingly. The experimental results show that this approach improves the decoder's output and, to some extent, the translation quality of the machine translation system.
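
The reranking step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the "dependent syntax language model" is read here as a model over dependency arcs and is replaced by a toy head-to-dependent bigram model with add-one smoothing; the names `Hypothesis`, `DependencyLM`, `rerank`, and the `weight` parameter are hypothetical; and each hypothesis is assumed to arrive with its dependency arcs already parsed and a log-domain decoder score.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple

Arc = Tuple[str, str]  # (head word, dependent word)


@dataclass
class Hypothesis:
    text: str
    decoder_score: float  # assumed to be a log-domain score from the decoder
    arcs: List[Arc]       # assumed to come from an external dependency parser


class DependencyLM:
    """Toy stand-in for a dependency language model (an assumption):
    P(dependent | head) estimated from arc counts with add-one smoothing."""

    def __init__(self, training_arcs: List[Arc]):
        self.pair_counts = Counter(training_arcs)
        self.head_counts = Counter(head for head, _ in training_arcs)
        # Distinct dependents seen in training, plus one slot for unseen words.
        self.vocab_size = len({dep for _, dep in training_arcs}) + 1

    def log_prob(self, arcs: List[Arc]) -> float:
        total = 0.0
        for head, dep in arcs:
            numerator = self.pair_counts[(head, dep)] + 1
            denominator = self.head_counts[head] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


def rerank(nbest: List[Hypothesis], lm: DependencyLM, weight: float = 1.0) -> List[Hypothesis]:
    """Add the language model score to the decoder score and sort, best first.
    The weight is a hypothetical knob; 1.0 corresponds to a plain sum."""
    return sorted(
        nbest,
        key=lambda h: h.decoder_score + weight * lm.log_prob(h.arcs),
        reverse=True,
    )


# Made-up training arcs and a two-entry N-best list, for illustration only.
training_arcs = [
    ("eats", "cat"), ("eats", "fish"), ("cat", "the"),
    ("eats", "cat"), ("fish", "the"),
]
lm = DependencyLM(training_arcs)

nbest = [
    Hypothesis("the cat fish eats", -2.0,
               [("fish", "cat"), ("eats", "fish"), ("cat", "the")]),
    Hypothesis("the cat eats fish", -2.1,
               [("eats", "cat"), ("eats", "fish"), ("cat", "the")]),
]

reranked = rerank(nbest, lm)
for hyp in reranked:
    print(hyp.text, round(hyp.decoder_score + lm.log_prob(hyp.arcs), 3))
```

In this toy list the decoder slightly prefers the first hypothesis, while the arc-based score favours the second, so the combined score changes the order. A real system would substitute a trained model and, most likely, a tuned weight.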

Related resources

A new model for persian multi-part words edition based on statistical machine translation

Multi-part words in English are hyphenated: a hyphen separates the different parts. Persian has multi-part words as well. In Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead. This common incorrect use of the space leads to some s...

A Decoder for Syntax-based Statistical MT

This paper describes a decoding algorithm for a syntax-based translation model (Yamada and Knight, 2001). The model has been extended to incorporate phrasal translations as presented here. In contrast to a conventional word-to-word statistical model, a decoder for the syntax-based model builds up an English parse tree given a sentence in a foreign language. As the model size becomes huge in a pr...

The ILLC-uva SMT system for IWSLT 2010

In this paper we give an overview of the ILLC-UvA (Institute for Logic, Language and Computation, University of Amsterdam) submission to the 7th International Workshop on Spoken Language Translation evaluation campaign. It outlines the architecture and configuration of the novel feature we are introducing: a syntax-based model for source-side reordering via tree transduction. We have concentrated...

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

Automatic Category Label Coarsening for Syntax-Based Machine Translation

We consider SCFG-based MT systems that get syntactic category labels from parsing both the source and target sides of parallel training data. The resulting joint nonterminals often lead to needlessly large label sets that are not optimized for an MT scenario. This paper presents a method of iteratively coarsening a label set for a particular language pair and training corpus. We apply this label...

Publication date: 2011
